Abstract: Backpressure routing and scheduling, with its
throughput-optimal operation guarantee, is a promising technique
for improving throughput over wireless multi-hop networks.
Although the backpressure framework is conceptually viewed 
as layered, the decisions of routing and scheduling are made 
jointly, which imposes several challenges in practice. In this 
work, we present Diff-Max, an approach that separates routing 
and scheduling and has three strengths: (i) Diff-Max improves 
throughput significantly, (ii) the separation of routing and 
scheduling makes practical implementation easier by minimizing 
cross-layer operations; i.e., routing is implemented in the network 
layer and scheduling is implemented in the link layer, and (iii) 
the separation of routing and scheduling leads to modularity; 
i.e., routing and scheduling are independent modules in Diff-Max,
and one can continue to operate even if the other does not.
Our approach is grounded in a network utility maximization 
(NUM) formulation of the problem and its solution. Based on 
the structure of Diff-Max, we propose two practical schemes: 
Diff-subMax and wDiff-subMax. We demonstrate the benefits of 
our schemes through simulation in ns-2, and we implement a 
prototype on smartphones. 

I. Introduction 

The backpressure routing and scheduling paradigm has
emerged from the pioneering work in [1], [2], which showed
that, in wireless networks where nodes route packets and make
scheduling decisions based on queue backlog differences, one
can stabilize queues for any feasible traffic. This seminal idea
has generated a lot of research interest. Most importantly, it
has been shown that backpressure can be combined with flow
control to provide a utility-optimal operation guarantee [3].

The strengths of these techniques have recently increased
interest in practical implementations of the backpressure
framework over wireless networks, some of which are summarized
in Section VI. However, the practical implementation
of backpressure imposes several challenges, mainly due to the
joint nature of the routing and scheduling algorithms, which
is the focus of this paper.

In classical backpressure, each node constructs per-flow 
queues. Based on the per-flow queue backlog differences, 
and by taking into account the state of the network, each 
node makes routing and scheduling decisions. Although the 
backpressure framework is conceptually viewed as layered, the 
decisions of routing and scheduling are made jointly. To better 
illustrate this key point, let us discuss the following example. 

This work was supported by NSF grant CNS-0915988, ONR grant
N00014-12-1-0064, and ARO MURI grant number W911NF-08-1-0238.
Fig. 1. Example topology consisting of three nodes, $i$, $j$, $k$, and two flows, 1 and
2. Note that this small topology is a zoomed part of a large multi-hop wireless
network. The source and destination nodes of flows 1 and 2 are not shown
in this example, i.e., nodes $i$, $j$, $k$ are intermediate nodes which route and
schedule flows 1 and 2. $U_i^1$ and $U_i^2$ are per-flow queue sizes, and $V_{i,j}$ and $V_{i,k}$
are per-link queue sizes. (a) Backpressure: Node $i$ determines queue backlog
differences at time $t$: $D_{i,j}^s(t) = U_i^s(t) - U_j^s(t)$, $D_{i,k}^s(t) = U_i^s(t) - U_k^s(t)$,
where $s \in \{1, 2\}$. Based on these differences as well as the channel state of
the network, $C(t)$, it makes joint routing and scheduling decisions. (b) Diff-Max:
Node $i$ makes the routing decision based on the queue backlog differences
at time $t$: $D_{i,j}^s(t) = U_i^s(t) - U_j^s(t) - V_{i,j}(t)$, $D_{i,k}^s(t) = U_i^s(t) - U_k^s(t) - V_{i,k}(t)$,
where $s \in \{1, 2\}$. Separately, node $i$ makes the scheduling decision
based on $V_{i,j}(t)$, $V_{i,k}(t)$, and $C(t)$.

Example 1: Let us consider Fig. 1(a) for backpressure
operation. At time $t$, node $i$ makes routing and scheduling
decisions for flows 1 and 2 based on the per-flow queue sizes
$U_i^1(t)$, $U_i^2(t)$, as well as the queue sizes of the other nodes,
i.e., nodes $j$ and $k$ in this example, and using the channel
state of the network $C(t)$. In particular, backpressure
determines the flow that should be transmitted over link $i-j$
by $s^* = \arg\max\{D_{i,j}^1(t), D_{i,j}^2(t)\}$ such that $s^* \in \{1, 2\}$.
The decision mechanism is the same for link $i-k$. Note
that this is joint routing (i.e., the next hop decision) and
scheduling (i.e., the flow selection for transmission). The
scheduling algorithm also determines the link activation policy.
In particular, the maximum backlog differences over each link
are calculated as $D_{i,j}^*(t) = D_{i,j}^{s^*}(t)$ and $D_{i,k}^*(t) = D_{i,k}^{s^*}(t)$.
Based on $D_{i,j}^*(t)$, $D_{i,k}^*(t)$, and $C(t)$, the scheduling algorithm
determines the link that should be activated. Note that the
decisions of routing and scheduling (the latter also known as the
max-weight algorithm) are made jointly in the backpressure framework,
which imposes several challenges in practice. We elaborate
on them next. ■
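
The joint decision in Example 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `backpressure_decision`, the dictionary layout, and the simplification that node $i$ may activate only one of its own outgoing links are assumptions made for illustration (in the full framework, link activation is a network-wide max-weight problem).

```python
def backpressure_decision(U_i, U_nbr, link_rate):
    """Joint flow selection and link activation at one node (illustrative).

    U_i       : dict flow -> backlog of that flow at node i
    U_nbr     : dict neighbor -> dict flow -> backlog at that neighbor
    link_rate : dict neighbor -> rate of link i-neighbor under the current
                channel state C(t) (0 when the link is OFF)
    Returns (neighbor, flow, weight) for the link with the largest weight.
    """
    best = None
    for j, U_j in U_nbr.items():
        # Per-flow backlog differences D_{i,j}^s(t) = U_i^s(t) - U_j^s(t).
        diffs = {s: U_i[s] - U_j.get(s, 0) for s in U_i}
        # s* = argmax_s D_{i,j}^s(t): the flow that would use link i-j.
        s_star = max(diffs, key=diffs.get)
        # Link weight: maximum backlog difference times the link rate.
        weight = max(diffs[s_star], 0) * link_rate[j]
        if best is None or weight > best[2]:
            best = (j, s_star, weight)
    return best


# Two flows at node i, neighbors j and k as in Fig. 1(a).
U_i = {1: 10, 2: 4}
U_nbr = {"j": {1: 3, 2: 1}, "k": {1: 8, 2: 0}}
decision = backpressure_decision(U_i, U_nbr, {"j": 1, "k": 1})
```

Note that the flow choice (routing) and the link choice (scheduling) come out of the same computation, which is exactly the coupling discussed in the example.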

Routing algorithms are traditionally designed in the network
layer, while scheduling algorithms are implemented in
the link layer in current networks. However, the joint routing
and scheduling nature of backpressure imposes challenges for
practical implementation. To deal with these challenges, [4]
implements backpressure at the link layer, and [5] proposes
a system in the MAC layer. This first approach is difficult in
practice due to device memory limitations and strict limitations
imposed by device firmware and drivers against changing link
layer functionality. The second approach is to implement
backpressure in (or below) the network layer [6], [7], [8].
This approach requires joint operation of the network and
link layers, so that the backpressure framework works
gracefully with the link layer. Therefore, the network and link
layers should work together synchronously, which may not
be practical for many off-the-shelf devices.

Existing networks are designed in layers, in which protocols
and algorithms are modular and operate independently at each
layer of the protocol stack. E.g., routing algorithms at the
network layer should work in harmony with different types
of scheduling algorithms in the link layer. However, the joint
nature of backpressure requires joint operation and hurts
modularity, which is especially important in contemporary
wireless networks, which may vary from networks of a few nodes
to ones with hundreds of nodes. It is natural to expect that
different types of networks, according to their size as well
as software and hardware limitations, may choose to employ
backpressure partially or fully. E.g., some networks may be
able to employ both routing and scheduling algorithms, while
others may only employ routing. Therefore, the algorithms of
backpressure, i.e., routing and scheduling, should be modular.

In this paper, we are interested in a framework in which
routing and scheduling are separated. We seek a scheme in
which routing is performed independently at the
network layer and scheduling decisions are made at the
link layer. The key ingredients of our approach, which we call
Diff-Max¹, are: (i) per-flow queues at the network layer, with
routing decisions based on their differences, and (ii) per-link
queues at the link layer, with scheduling decisions
based on their size.

Example 1 - continued: Let us consider Fig. 1(b) for Diff-Max
operation. (i) Routing: at time $t$, node $i$ makes the routing
decision for flows 1 and 2 based on the queue backlog differences
$D_{i,j}^s(t)$ and $D_{i,k}^s(t)$, where $s \in \{1, 2\}$. This decision is made at
the network layer and the routed packets are inserted in
the link layer queues. Note that in classical backpressure,
routed packets are scheduled jointly, i.e., when a packet is
routed, it should be transmitted if the corresponding links are
activated. Hence, both algorithms should make decisions jointly
in classical backpressure. However, in our scheme, a packet
may be routed at time $t$, and scheduled and transmitted at a
later time $t + T$ where $T > 0$. (ii) Scheduling: at the link layer,

1 The rationale behind the name of our scheme, i.e., Diff-Max, is as follows.
Diff means that the routing part is based on queue differences, and Max refers
to the fact that the scheduling part is based on the maximum of the (weighted)
link layer queues. Finally, the hyphen in Diff-Max signifies the separated
nature of the routing and scheduling algorithms.



links are activated and packets are transmitted based on the per-link
queue sizes $V_{i,j}$, $V_{i,k}$, and $C(t)$. The details of Diff-Max
are provided in Section III. ■
Our approach is grounded in a network utility maximization
(NUM) framework [9]. The solution decomposes into
several parts with an intuitive interpretation, such as routing,
scheduling, and flow control. The structure of the NUM
solution provides insight into the design of our scheme,
Diff-Max. By separating routing and scheduling, Diff-Max
makes practical implementation easier and minimizes
cross-layer operations. We also propose two practical schemes:
Diff-subMax and wDiff-subMax. The following are the key
contributions of this work:

• We propose a new system model and NUM framework
to separate routing and scheduling. Our solution to the
NUM problem separates routing and scheduling such
that routing is implemented at the network layer, and
scheduling at the link layer. Based on the structure
of the NUM solution, we propose Diff-Max.

• We extend Diff-Max to employ the routing and scheduling
parts, but disable the link activation part of the scheduling
algorithm. We call the new framework Diff-subMax,
which reduces computational complexity and overhead
significantly, and provides high throughput improvements
in practice. Namely, Diff-subMax only needs information
from one-hop neighbors to make its routing and
scheduling decisions.

• We propose a window-based routing mechanism, wDiff-subMax,
which implements routing, but disables
scheduling. wDiff-subMax is designed for scenarios
in which the implementation of the scheduling algorithm
in the link layer is impossible (or not preferable) due to
device restrictions. wDiff-subMax makes routing decisions
on the fly, and minimizes overhead.

• We evaluate our schemes in a multi-hop setting and consider
their interaction with the transport, network, and link
layers. We perform numerical calculations confirming
that Diff-Max is as good as backpressure. We implement
our schemes in a simulator, ns-2 [10], and show that they
significantly improve throughput as compared to adaptive
routing schemes such as Ad hoc On-Demand Distance
Vector (AODV) [11]. Finally, we implemented a prototype
of wDiff-subMax on Galaxy Nexus smartphones
with Android 4.0 (Ice Cream Sandwich) [12].

The structure of the rest of the paper is as follows. Section II
gives an overview of the system model. Section III presents the
NUM formulation and solution. Section IV presents the design
and development of Diff-Max schemes and their interaction
with the protocol stack. Section V presents simulation results.
Section VI presents related work. Section VII concludes the
paper.

II. System Overview

We consider multi-hop wireless networks, in which packets 
from a source traverse potentially multiple wireless hops 
before being received by their receiver. In this setup, each 
wireless node is able to perform routing, scheduling, and flow 




Fig. 2. A wireless mesh network. The queues at the network and link layers,
and the interaction among the queues, inside node $i$ are shown here in detail.
$U_i^s$ and $U_i^{s'}$ are the network layer queues for flows $s$ and $s'$, and $V_{i,j}$ and
$V_{i,l}$ are the per-link queues for links $i-j$ and $i-l$. The Diff-Max algorithm
makes the routing decision in the network layer, and the scheduling decision
in the link layer.

control. In this section, we provide an overview of this setup
and highlight some of its key characteristics. Fig. 2 shows the
key parts of our system model in an example topology.

A. Notation and Setup 

The wireless network consists of $N$ nodes and $L$ edges,
where $\mathcal{N}$ is the set of nodes and $\mathcal{L}$ is the set of edges in
the network. In our formulation and analysis, we consider that
time is slotted, and $t$ refers to the beginning of slot $t$.

1) Sources and Flows: Let $\mathcal{S}$ be the set of unicast flows
between source-destination pairs in the network. Each flow
$s \in \mathcal{S}$ arrives from the application layer to the transport layer with
rate $A_s(t)$, $\forall s \in \mathcal{S}$, at time slot $t$. The arrival rates are i.i.d. over
the slots, their expected values are $A_s = E[A_s(t)]$, $\forall s \in \mathcal{S}$,
and $E[A_s(t)^2]$ are finite. The transport layer stores the arriving
packets in reservoirs (i.e., transport layer per-flow queues),
and controls the flow traffic. In particular, each source $s$ is
associated with rate $x_s$ considering a utility function $g_s(x_s)$,
which we assume to be a strictly concave function of $x_s$. The
transport layer determines $x_s(t)$ at time slot $t$ according to
the utility function $g_s$. $x_s(t)$ packets are transmitted from the
transport layer reservoir to the network layer at slot $t$.

2) Queue Structures: At node $i \in \mathcal{N}$, there are network
and link layer queues. The network layer queues are per-flow
queues; i.e., $U_i^s$ is the queue at node $i \in \mathcal{N}$ that only stores
packets from flow $s \in \mathcal{S}$. The link layer queues are per-link
queues; i.e., at each node $i \in \mathcal{N}$, a link layer queue $V_{i,j}$ is
constructed for each neighbor node $j \in \mathcal{N}$ (Fig. 2).²

3) Flow Rates: Our model optimizes the flow rates among
different nodes as well as the flow rates within a node among
different layers: the transport, network, and link layers.

The transport layer determines $x_s(t)$ at time $t$, and passes
$x_s(t)$ packets to the network layer. These packets are inserted
in the network layer queue $U_i^s$ (assuming that node $i$ is the

2 Note that in some devices, there might be only one queue (per-node queue) 
for data transmission instead of per-link queues in the link layer. Developing 
a model with per-node queues is challenging due to coupling among actions 
and states, so it is an open problem. 



source node of flow $s$). The network layer may also receive
packets from the other nodes and insert them in $U_i^s$. The link
transmission rate is $h_{k,i}(t)$ at time $t$. $h_{k,i}(t)$ is larger than (or
equal to) the per-flow data rates over link $k-i$. E.g., we can write
for Fig. 2 that $h_{k,i}(t) \geq h_{k,i}^s(t) + h_{k,i}^{s'}(t)$, where $h_{k,i}^s(t)$ is the
data rate of flow $s$ over link $k-i$. Note that $h_{k,i}^s(t)$ is the
actual data transmission rate of flow $s$ over link $k-i$, while
$h_{k,i}(t)$ is the available rate over link $k-i$, at time $t$. At every
time slot $t$, $U_i^s$ changes according to the following dynamics.

$$U_i^s(t+1) = \max\Big[U_i^s(t) - \sum_{j \in \mathcal{N}} f_{i,j}^s(t),\, 0\Big] + \sum_{k \in \mathcal{N}} h_{k,i}^s(t) + x_s(t)\, 1_{[i=o(s)]} \qquad (1)$$

where $o(s)$ is the source node of flow $s$ and $1_{[i=o(s)]}$ is an
indicator function, which is 1 if $i = o(s)$, and 0 otherwise.

The data rate from the network layer to the link layer
queues is $f_{i,j}^s(t)$. In particular, $f_{i,j}^s(t)$ is the actual rate of the
packets, belonging to flow $s$, from the network layer queue
$U_i^s$ to the link layer queue $V_{i,j}$ at node $i$. Note that the
optimization of the flow rate $f_{i,j}^s(t)$ is the routing decision, since
it determines how many packets from flow $s$ should
be forwarded (hence routed) to node $j$. At every time slot $t$,
$V_{i,j}$ changes according to the following queue dynamics.

$$V_{i,j}(t+1) = \max[V_{i,j}(t) - h_{i,j}(t),\, 0] + \sum_{s \in \mathcal{S}} f_{i,j}^s(t) \qquad (2)$$

The link transmission rate from node $i$ to node $j$ is $h_{i,j}(t)$. As
mentioned above, $h_{i,j}(t)$ upper bounds the per-flow data rates;
i.e., $h_{i,j}(t) \geq \sum_{s \in \mathcal{S}} h_{i,j}^s(t)$. Note that the optimization of the
link transmission rate $h_{i,j}(t)$ corresponds to the scheduling
decision, since it determines which packets from which link
layer queues should be transmitted as well as whether a link
is activated.
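
To make the two-level queueing concrete, the following Python sketch evaluates one slot of the dynamics in Eqs. (1) and (2) for a single network layer queue and a single link layer queue. The function names and argument layout are hypothetical and chosen only for this illustration.

```python
def step_network_queue(U, f_out, h_in, x, is_source):
    """One slot of Eq. (1) for the per-flow queue U_i^s.

    U         : current backlog U_i^s(t)
    f_out     : rates f_{i,j}^s(t) moved to the link layer queues, one per neighbor j
    h_in      : per-flow rates h_{k,i}^s(t) received from neighbors k
    x         : exogenous arrivals x_s(t) from the transport layer
    is_source : True if i = o(s), i.e., the indicator in Eq. (1) equals 1
    """
    return max(U - sum(f_out), 0) + sum(h_in) + (x if is_source else 0)


def step_link_queue(V, h, f_in):
    """One slot of Eq. (2) for the per-link queue V_{i,j}.

    V    : current backlog V_{i,j}(t)
    h    : link transmission rate h_{i,j}(t) (the scheduling decision)
    f_in : per-flow rates f_{i,j}^s(t) inserted by the routing decision
    """
    return max(V - h, 0) + sum(f_in)


# Node i is the source of the flow: 3 packets routed out, 3 relayed in, 4 new.
U_next = step_network_queue(U=5, f_out=[2, 1], h_in=[3], x=4, is_source=True)
# The link drains up to 5 packets and receives 3 newly routed ones.
V_next = step_link_queue(V=3, h=5, f_in=[2, 1])
```

The routing variables appear only as departures from $U_i^s$ and arrivals to $V_{i,j}$, while the scheduling variable $h_{i,j}$ appears only as a departure from $V_{i,j}$; this is the structural hook that lets the two decisions be made separately.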

B. Channel Model and Capacity Region 

1) Channel Model: Consider one-hop transmission over
link $l$, where $l = (i, j)$ such that $i, j \in \mathcal{N}$ and $i \neq j$.
At each slot $t$, $C(t)$ is the channel state vector, where
$C(t) = \{C_1(t), \ldots, C_l(t), \ldots, C_L(t)\}$. $C_l(t)$ is the state of
link $l$ at time $t$ and takes values from the set {ON, OFF}
according to a probability distribution which is i.i.d. over time
slots. If $C_l(t) = ON$, packets are transmitted with rate $R_l$.
Otherwise (i.e., if $C_l(t) = OFF$), no packets are transmitted.

Let $\Gamma_{C(t)}$ denote the set of link transmission rates feasible
at time slot $t$ for channel state $C(t)$. In particular,
at every time slot $t$, the link transmission vector $h(t) =
\{h_1(t), \ldots, h_l(t), \ldots, h_L(t)\}$ should be constrained such that
$h(t) \in \Gamma_{C(t)}$.

2) Capacity Region: Let $(A_s)$ be the vector of arrival rates
$\forall s \in \mathcal{S}$. The network layer capacity region $\Lambda$ is defined as the
closure of all arrival vectors that can be stably transmitted in
the network, considering all possible routing and scheduling
policies [1], [2], [3]. $\Lambda$ is fixed and depends only on channel
statistics characterized by $\Gamma_{C(t)}$.



III. Diff-Max: Formulation and Design 



A. Network Utility Maximization 

In this section, we formulate and design the Diff-Max
framework. Our first step is the NUM formulation of the problem
and its solution. This approach (i.e., NUM formulation
and its solution) sheds light on the structure of the Diff-Max
algorithms. Note that the NUM formulation optimizes
the average values of the parameters (i.e., flow rates) that are
defined in Section II. By abuse of notation, we use a variable,
e.g., $\phi$, as the average value of $\phi(t)$ in our NUM formulation, if
both $\phi$ and $\phi(t)$ refer to the same parameter.

1) Formulation: Our objective is to maximize the total
utility function by optimally choosing the flow rates $x_s$,
$\forall s \in \mathcal{S}$, as well as the following variables at each node: the
amount of data traffic that should be routed to each neighbor
node, i.e., $f_{i,j}^s$, and the link transmission rates, i.e., $h_{i,j}$.

$$\max_{x, f, h} \sum_{s \in \mathcal{S}} g_s(x_s)$$

$$\text{s.t.} \quad \sum_{j \in \mathcal{N}} f_{i,j}^s - \sum_{j \in \mathcal{N}} h_{j,i}^s = \begin{cases} x_s, & \text{if } i = o(s) \\ 0, & \text{otherwise} \end{cases} \quad \forall i \in \mathcal{N}, s \in \mathcal{S}$$

$$\sum_{s \in \mathcal{S}} f_{i,j}^s \leq h_{i,j}, \quad \forall (i,j) \in \mathcal{L}$$

$$f_{i,j}^s = h_{i,j}^s, \quad \forall s \in \mathcal{S}, (i,j) \in \mathcal{L}$$

$$h \in \Gamma \qquad (3)$$

The first constraint is the flow conservation constraint at
the network layer: at every node $i$ and for each flow $s$,
the sum of the total incoming traffic, i.e., $\sum_{j \in \mathcal{N}} h_{j,i}^s$, and the
exogenous traffic, i.e., $x_s$, should be equal to the total outgoing
traffic from the network layer, i.e., $\sum_{j \in \mathcal{N}} f_{i,j}^s$. The second
constraint is also a flow conservation constraint, but at the
link layer: the link transmission rate, i.e., $h_{i,j}$, should be
at least as large as the incoming traffic, i.e., $\sum_{s \in \mathcal{S}} f_{i,j}^s$. Note that
this constraint is an inequality, because the link transmission rate
can be larger than the actual data traffic. The third constraint
shows the relationship between the network and link layer
per-flow data rates. The last constraint shows that the vector of
link transmission rates, $h = \{h_1, \ldots, h_l, \ldots, h_L\}$, should be an
element of the set of available link rates, $\Gamma$. Note that $\Gamma$ is different
from $\Gamma_{C(t)}$ in the sense that $\Gamma$ is characterized by the loss
probability over each link, $p_l$, $\forall l \in \mathcal{L}$, rather than the channel
state vector $C(t)$.

The first and second constraints are key to our work, because
they determine the incoming and outgoing flow relationships
at the network and link layers, respectively. Such an approach
separates routing from scheduling, and assigns routing to
the network layer and scheduling to the link layer. Note that
if these constraints are combined in such a way that the incoming
rate from a node and the exogenous traffic should be smaller than
the outgoing traffic for each flow, we obtain the backpressure
solution [13], [14].

2) Solution: By relaxing the first two flow conservation
constraints in Eq. (3), we have:

$$L(x, f, h, u, v) = \sum_{s \in \mathcal{S}} g_s(x_s) + \sum_{i \in \mathcal{N}} \sum_{s \in \mathcal{S}} u_i^s \Big( \sum_{j \in \mathcal{N}} f_{i,j}^s - \sum_{j \in \mathcal{N}} h_{j,i}^s - x_s 1_{[i=o(s)]} \Big) - \sum_{(i,j) \in \mathcal{L}} v_{i,j} \Big( \sum_{s \in \mathcal{S}} f_{i,j}^s - h_{i,j} \Big) \qquad (4)$$

where $u_i^s$ and $v_{i,j}$ are the Lagrange multipliers, which can be
interpreted as representatives of the network and link layer
queues, $U_i^s$ and $V_{i,j}$, respectively.³ The Lagrange function can
be rewritten as:

$$L(x, f, h, u, v) = \sum_{s \in \mathcal{S}} \big( g_s(x_s) - u_{o(s)}^s x_s \big) + \sum_{i \in \mathcal{N}} \sum_{s \in \mathcal{S}} \sum_{j \in \mathcal{N}} u_i^s f_{i,j}^s - \sum_{i \in \mathcal{N}} \sum_{s \in \mathcal{S}} \sum_{j \in \mathcal{N}} u_i^s h_{j,i}^s - \sum_{(i,j) \in \mathcal{L}} \sum_{s \in \mathcal{S}} v_{i,j} f_{i,j}^s + \sum_{(i,j) \in \mathcal{L}} v_{i,j} h_{i,j} \qquad (5)$$

Eq. (5) can be decomposed into several intuitive problems such
as flow control, routing, and scheduling.
First, we solve the Lagrangian with respect to $x_s$:

$$x_s = (g_s')^{-1}\big(u_{o(s)}^s\big), \qquad (6)$$

where $(g_s')^{-1}$ is the inverse function of the derivative of $g_s$.
This part of the solution is interpreted as the flow control.

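Second, we solve the Lagrangian for $f_{i,j}^s$ and $h_{i,j}^s$. The
following part of the solution is interpreted as the routing.

$$\max_{f} \; \sum_{i \in \mathcal{N}} \sum_{s \in \mathcal{S}} \sum_{j \in \mathcal{N}} u_i^s f_{i,j}^s - \sum_{i \in \mathcal{N}} \sum_{s \in \mathcal{S}} \sum_{j \in \mathcal{N}} u_i^s h_{j,i}^s - \sum_{(i,j) \in \mathcal{L}} \sum_{s \in \mathcal{S}} v_{i,j} f_{i,j}^s$$

$$\text{s.t.} \quad f_{i,j}^s = h_{i,j}^s, \quad \forall i \in \mathcal{N}, j \in \mathcal{N}, s \in \mathcal{S} \qquad (7)$$

The above problem is equivalent to:

$$\max_{f} \; \sum_{(i,j) \in \mathcal{L}} \sum_{s \in \mathcal{S}} f_{i,j}^s \big( u_i^s - u_j^s - v_{i,j} \big) \qquad (8)$$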
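
The equivalence between the routing problem in Eq. (7) and the compact form in Eq. (8) follows from a substitution and a relabeling of indices. The short derivation below is a sketch that uses only the symbols defined above; the one added assumption is that $f_{i,j}^s = 0$ whenever $(i,j) \notin \mathcal{L}$, so that the sums over node pairs reduce to sums over links.

```latex
\begin{align*}
&\sum_{i}\sum_{s}\sum_{j} u_i^s f_{i,j}^s
 - \sum_{i}\sum_{s}\sum_{j} u_i^s h_{j,i}^s
 - \sum_{(i,j)\in\mathcal{L}}\sum_{s} v_{i,j} f_{i,j}^s \\
&\quad= \sum_{i}\sum_{s}\sum_{j} u_i^s f_{i,j}^s
 - \sum_{i}\sum_{s}\sum_{j} u_i^s f_{j,i}^s
 - \sum_{(i,j)\in\mathcal{L}}\sum_{s} v_{i,j} f_{i,j}^s
 && \text{(constraint } h_{j,i}^s = f_{j,i}^s\text{)} \\
&\quad= \sum_{i}\sum_{s}\sum_{j} u_i^s f_{i,j}^s
 - \sum_{i}\sum_{s}\sum_{j} u_j^s f_{i,j}^s
 - \sum_{(i,j)\in\mathcal{L}}\sum_{s} v_{i,j} f_{i,j}^s
 && \text{(swap the labels } i \leftrightarrow j\text{)} \\
&\quad= \sum_{(i,j)\in\mathcal{L}}\sum_{s\in\mathcal{S}}
 f_{i,j}^s \left( u_i^s - u_j^s - v_{i,j} \right)
 && \text{(}f_{i,j}^s = 0 \text{ off } \mathcal{L}\text{)}
\end{align*}
```

Each term $f_{i,j}^s$ is therefore weighted by $u_i^s - u_j^s - v_{i,j}$, which is the quantity whose sign drives the routing rule.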



Third, we solve the Lagrangian for $h_{i,j}$. The following part
of the solution is interpreted as scheduling.

$$\max_{h} \; \sum_{(i,j) \in \mathcal{L}} v_{i,j} h_{i,j}$$

$$\text{s.t.} \quad h \in \Gamma. \qquad (9)$$



The decomposed parts of the Lagrangian, i.e., Eqs. (6),
(8), (9), as well as the Lagrange multipliers $u_i^s$ and $v_{i,j}$, can
be solved iteratively via a gradient descent algorithm. The
convergence properties of this iterative algorithm are provided
in [15]. Next, we propose Diff-Max based on the structure of
the decomposed solution.

3 Note that $u_i^s$ and $v_{i,j}$ are Lagrange multipliers. Although they are
interpreted as representations of the queue sizes, they are not actual queue
sizes, but functions of them. On the other hand, $U_i^s$ and $V_{i,j}$ are actual
queue sizes.



B. Diff-Max 
Now, we provide a stochastic control strategy including routing,
scheduling, and flow control. The strategy, i.e., Diff-Max,
which mimics the NUM solution, combines separated routing
and scheduling together with the flow control strategy.

Diff-Max: 

• Routing. Node $i$ observes the network layer queue backlogs
in all neighboring nodes at time $t$ and determines:

$$f_{i,j}^s(t) = \begin{cases} F_i^{\max}, & \text{if } U_i^s(t) - U_j^s(t) - V_{i,j}(t) > 0 \\ 0, & \text{otherwise} \end{cases} \qquad (10)$$

where $F_i^{\max}$ is a constant larger than the maximum outgoing
rate from node $i$. According to Eq. (10), $f_{i,j}^s(t)$
packets are removed from $U_i^s(t)$ and inserted in the
link layer queue $V_{i,j}(t)$. This routing algorithm mimics
Eq. (8) and has the following interpretation. Packets
from flow $s$ can be transmitted to the next hop node
$j$ as long as the network layer queue in the next hop
(node $j$) is small, which means that node $j$ is able
to route the packets, and the link layer queue at the
current node (node $i$) is small, which means that the
congestion over link $i-j$ is relatively small. Note that
if the number of packets in $U_i^s(t)$ is limited, the packets
are moved to the link layer queues beginning with
the largest $U_i^s(t) - U_j^s(t) - V_{i,j}(t)$.
The routing algorithm in Eq. (10) uses per-link queues
as well as per-flow queues, which is the main difference
of Eq. (10) as compared to backpressure routing.
Backpressure routing only uses per-flow queues, and does
not take into account the state of the link layer queues
(they do not exist in its formulation).

• Scheduling. At each time slot $t$, the link rate $h_{i,j}(t)$ is
determined by:

$$\max_{h} \; \sum_{(i,j) \in \mathcal{L}} V_{i,j}(t)\, h_{i,j}(t)$$

$$\text{s.t.} \quad h(t) \in \Gamma_{C(t)} \qquad (11)$$

This scheduling algorithm mimics Eq. (9) and has the following
interpretation. The link $i-j$ with the largest queue
backlog $V_{i,j}$, taking into account the channel state
vector $C(t)$, should be activated, and a packet (or packets) from
the corresponding queue ($V_{i,j}$) should be transmitted.
We note that this problem (scheduling or max-weight)
is known to be a hard problem [9], [13]. Therefore, we
propose sub-optimal scheduling algorithms that interact
well with the routing algorithm in Eq. (10).
The scheduling algorithm in Eq. (11) differs from
classical backpressure in the sense that it is completely
independent of the routing. In particular, Eq. (11)
makes the scheduling decision based on the per-link queues
$V_{i,j}$ and the channel state $C(t)$, while classical
backpressure uses the maximum queue backlog differences
dictated by the routing algorithm. As can be seen, the routing
Fig. 3. Diff-Max operations at end-points and intermediate nodes.

and scheduling are operating jointly in backpressure, 
while in Diff-Max, these algorithms are separated. 
• Flow Control. At every time slot $t$, the flow/rate controller
at the transport layer of node $i$ observes the
current level of the network layer queue backlogs $U_i^s(t)$
and determines the number of packets that should be
passed from the transport layer to the network layer
according to:

$$\max_{x} \; \sum_{s \in \mathcal{S} \mid i = o(s)} \big[ M g_s(x_s(t)) - U_i^s(t)\, x_s(t) \big]$$

$$\text{s.t.} \quad \sum_{s \in \mathcal{S} \mid i = o(s)} x_s(t) \leq R_i^{\max} \qquad (12)$$

where $R_i^{\max}$ is a constant larger than the maximum
outgoing rate from node $i$, and $M$ is a constant parameter,
$M > 0$. The flow control part of our solution mimics
Eq. (6) as well as the flow control algorithm proposed in
[3].

The analysis and performance bounds of
Diff-Max are discussed in [15].

IV. System Implementation 

We propose practical implementations of Diff-Max (Fig. 3)
as well as Diff-subMax, which combines the routing algorithm
with sub-optimal scheduling, and wDiff-subMax, which
makes routing decisions based on a window-based algorithm.

A. Diff-Max 

1) Flow Control: The flow control algorithm, implemented
at the transport layer at the end nodes (see Fig. 3), determines
the rate of each flow. We implement our flow control algorithm
as an extension of UDP in our simulator ns-2 and in our
Android testbed.

The flow control algorithm, at the source node $i$, divides
time into epochs (virtual slots) such as $t_i^1, t_i^2, \ldots, t_i^k, \ldots$, where
$t_i^k$ is the beginning of the $k$th epoch. Let us assume that
$t_i^{k+1} = t_i^k + T_i$, where $T_i$ is the epoch duration.

At time $t_i^k$, the flow control algorithm determines the rate
according to Eq. (12). We consider $g_s(x_s(t)) = \log(x_s(t))$
(note that any other concave utility function can be used). After
$x_s(t_i^k)$ is determined, the corresponding number of packets is
passed to the network layer and inserted into the network layer
queue $U_i^s$. Note that there might be some excess packets
at the transport layer if some packets are not passed to the
network layer. These packets are stored in a reservoir at the
transport layer, and transmitted in later slots. At the receiver
node, the transport protocol receives packets from the lower
layers and passes them to the application.
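
For the logarithmic utility, the per-epoch problem in Eq. (12) can be solved with a one-dimensional search. The Python sketch below is one possible way to do this and is not taken from the ns-2 or Android code: the function names, the bisection on the multiplier `mu` of the rate constraint, and the integer packet release are assumptions for illustration. Without the rate cap, the maximizer of $M \log x_s - U_i^s x_s$ is $x_s = M / U_i^s$; when the cap binds, all rates are reduced through a common multiplier.

```python
def flow_rates(U, M, R_max, iters=60):
    """Solve Eq. (12) with g_s(x) = log(x) for the flows sourced at one node.

    U     : dict flow -> network layer backlog U_i^s at the source node
    M     : positive constant from Eq. (12)
    R_max : cap on the total injected rate
    Returns dict flow -> rate x_s.
    """
    def total(mu):
        # Stationarity: M / x_s = U_s + mu, hence x_s = M / (U_s + mu).
        return sum(M / (u + mu) for u in U.values())

    if all(u > 0 for u in U.values()) and total(0.0) <= R_max:
        mu = 0.0  # the cap is slack; each flow gets x_s = M / U_s
    else:
        # total(hi) <= R_max because every term is at most M / hi.
        lo, hi = 0.0, len(U) * M / R_max
        for _ in range(iters):
            mid = (lo + hi) / 2
            if total(mid) > R_max:
                lo = mid
            else:
                hi = mid
        mu = hi
    return {s: M / (U[s] + mu) for s in U}


def release_from_reservoir(reservoir, x):
    """Pass up to floor(x) packets to the network layer; keep the rest."""
    sent = min(reservoir, int(x))
    return sent, reservoir - sent


uncapped = flow_rates({1: 2.0, 2: 4.0}, M=10, R_max=100)
capped = flow_rates({1: 2.0, 2: 4.0}, M=10, R_max=3.0)
sent, left = release_from_reservoir(reservoir=7, x=uncapped[1])
```

Larger backlogs at the source yield smaller injected rates, which is how congestion in the network layer queues throttles the transport layer.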

2) Routing: The routing algorithm, implemented at the
network layer of each node (both the end and intermediate
nodes) (see Fig. 3), determines the routing policy, i.e., the next
hop(s) to which packets are forwarded.

The first part of our routing algorithm is neighbor
discovery and queue size information exchange. Each node
$i$ transmits a message containing the size of its network layer
queues $U_i^s$. These messages are in general piggy-backed
on data packets. The nodes in the network operate in
promiscuous mode. Therefore, each node, say node $j$,
overhears a packet from node $i$ even if node $i$ transmits the
packet to another node, say node $k$. Node $j$ reads the
queue size information from the data packet it receives or
overhears (thanks to operating in promiscuous mode).
The queue size information is recorded for future routing
decisions. Note that when a node hears from another node
directly or through promiscuous mode, it classifies it as its
neighbor. The neighbor nodes of node $i$ form a set $\mathcal{N}_i$. As
we mentioned, queue size information is piggy-backed on data
packets. However, if there is no data packet for transmission
for some time, the node creates a packet to carry the
queue size messages and broadcasts it.
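
The bookkeeping described above can be sketched as a small neighbor table. This is only an illustration under stated assumptions: the class name `NeighborTable`, its methods, and the `idle_timeout` parameter are hypothetical, and timestamps are passed in explicitly rather than read from a clock.

```python
class NeighborTable:
    """Latest queue sizes heard from neighbors (directly or overheard)."""

    def __init__(self):
        self.backlog = {}      # neighbor -> {flow: advertised U_j^s}
        self.last_heard = {}   # neighbor -> time of the latest packet

    def on_packet(self, sender, advertised, now):
        # Any received or overheard packet makes `sender` a neighbor and
        # overwrites the previously recorded queue sizes.
        self.backlog[sender] = dict(advertised)
        self.last_heard[sender] = now

    def neighbors(self):
        return set(self.backlog)

    @staticmethod
    def needs_beacon(last_tx, now, idle_timeout):
        # With no data packet to piggy-back on for `idle_timeout`,
        # a dedicated queue size message should be broadcast.
        return now - last_tx >= idle_timeout


table = NeighborTable()
table.on_packet("j", {1: 3, 2: 1}, now=0.10)
table.on_packet("k", {1: 8}, now=0.12)
table.on_packet("j", {1: 5, 2: 0}, now=0.30)   # newer report replaces the old one
```

Only the most recent report per neighbor is kept, which matches the use of the latest overheard value in the routing decision.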

The second part of our routing algorithm is the actual routing
decision. Similar to the flow control algorithm, the routing
algorithm divides time into epochs, such as $t'^1_i, t'^2_i, \ldots, t'^k_i, \ldots$,
where $t'^k_i$ is the beginning of the $k$th epoch at node $i$. Let us
assume that $t'^{k+1}_i = t'^k_i + T'_i$, where $T'_i$ is the epoch duration.
Note that we use $t'^k_i$ and $T'_i$ instead of $t^k_i$ and $T_i$, because these
two sets of epochs need not be the same or synchronized.

At time $t'^k_i$, the routing algorithm at the network layer
checks $U_i^s(t'^k_i) - U_j^s(t'^k_i) - V_{i,j}(t'^k_i)$ for each flow $s$. Note
that $U_j^s(t'^k_i)$ is not the instantaneous value of $U_j^s$ at time
$t'^k_i$; instead, it is the latest value of $U_j^s$ heard by node $i$
before $t'^k_i$. Note also that $V_{i,j}(t'^k_i)$ is the per-link queue at
node $i$, and this information should be passed to the network
layer for the routing decision. According to Eq. (10), $f_{i,j}^s(t'^k_i)$ is
determined, and $f_{i,j}^s(t'^k_i)$ packets are removed from $U_i^s$ and
inserted into the link layer queue $V_{i,j}$ at node $i$. Note that the
link layer transmits packets from $V_{i,j}$ only to node $j$, hence
the routing decision is completed. The routing algorithm is
summarized in Algorithm 1. Note that Algorithm 1 assumes
that there are enough packets in $U_i^s$ for transmission. If not,
the algorithm lists all the links $j \in \mathcal{N}_i$ in decreasing order
of the weight $U_i^s(t'^k_i) - U_j^s(t'^k_i) - V_{i,j}(t'^k_i)$. Then,
it routes packets beginning with the link that has the
largest weight.
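
The per-epoch routing decision, including the ordering used when $U_i^s$ holds too few packets, can be sketched in Python as follows. The function name `route_flow` and its arguments are hypothetical; the weights are evaluated once at the beginning of the epoch, which is an assumption of this sketch.

```python
def route_flow(U_i_s, U_nbr_s, V_i, F_max):
    """Routing decision for one flow s at node i in one epoch.

    U_i_s   : packets of flow s in the network layer queue U_i^s
    U_nbr_s : dict neighbor j -> latest heard value of U_j^s
    V_i     : dict neighbor j -> link layer queue size V_{i,j}
    F_max   : per-link cap, the constant F_i^max in Eq. (10)
    Returns dict neighbor j -> packets moved from U_i^s to V_{i,j}.
    """
    # Weight of each candidate next hop, evaluated at the epoch start.
    weights = {j: U_i_s - U_nbr_s[j] - V_i[j] for j in U_nbr_s}
    remaining = U_i_s
    moved = {}
    # Serve links in decreasing order of weight, so that a limited
    # number of packets goes to the largest weights first.
    for j in sorted(weights, key=weights.get, reverse=True):
        if weights[j] <= 0 or remaining == 0:
            break
        n = min(F_max, remaining)
        moved[j] = n
        remaining -= n
    return moved


# Ten packets of flow s at node i; neighbors j and k.
moved = route_flow(U_i_s=10, U_nbr_s={"j": 3, "k": 8},
                   V_i={"j": 2, "k": 1}, F_max=6)
```

Here link $i-j$ has weight 5 and link $i-k$ has weight 1, so $j$ is served first and $k$ receives the remaining packets. A long link layer queue toward $k$ would make its weight negative and exclude it.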

3) Scheduling: The scheduling algorithm in Eq. (11) assumes
that time is slotted, and determines the links that
should be activated and the (number of) packets that should be 
transmitted at each time slot. Although there are time-slotted 
system implementations, and also recent work on backpres- 



Algorithm 1 The routing algorithm at node $i$ for packets from
flow $s$ at slot $t'^k_i$.

1: for $\forall j \in \mathcal{N}_i$ do

2: Read the network layer queue size information of neighbors: $U_j^s(t'^k_i)$
3: Read the link layer queue size information: $V_{i,j}(t'^k_i)$
4: if $U_i^s(t'^k_i) - U_j^s(t'^k_i) - V_{i,j}(t'^k_i) > 0$ then

5: $f_{i,j}^s(t'^k_i) = F_i^{\max}$

6: else

7: $f_{i,j}^s(t'^k_i) = 0$

8: Remove $f_{i,j}^s(t'^k_i)$ packets from $U_i^s$

9: Pass $f_{i,j}^s(t'^k_i)$ packets to the link layer and insert them into $V_{i,j}$



sure implementation over time-slotted wireless networks [0, 
IEEE 802.11 MAC, an asynchronous medium access protocol 
without time slots, is the most widely used MAC protocol 
in current wireless networks. Therefore, we implement
our scheduling algorithm (Eq. (11)) on top of the 802.11 MAC (see
Fig. 3) with the following updates.

The scheduling algorithm constructs per-link queues at the
link layer. Node $i$ knows its own link layer queues, $V_{i,j}$,
and estimates the loss probability and link rates. Let us
consider that $\hat{p}_l$ and $\hat{R}_l$ are the estimated values of $p_l$ and
$R_l$, respectively. $\hat{p}_l$ is calculated as one minus the ratio of
correctly transmitted packets over all transmitted packets in a
time window over link $l$.⁴ $\hat{R}_l$ is calculated as the average of the
recent (in a window of time) link rates over link $l$. $V_{i,j}$, $\hat{p}_{i,j}$,
and $\hat{R}_{i,j}$ are piggy-backed on the data packets and exchanged
among nodes. Note that this information should be exchanged
among all nodes in the network, since each node is required
to make its own decision based on global information. Also,
each node knows the general topology and interfering links.
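
The two windowed estimates can be maintained with fixed-length buffers. The Python sketch below is illustrative only: the class name `LinkEstimator` is hypothetical, and the window is counted in samples (the last `window` transmissions and rate readings) rather than in seconds, which is a simplifying assumption.

```python
from collections import deque


class LinkEstimator:
    """Windowed estimates of loss probability and rate for one link l."""

    def __init__(self, window):
        self.outcomes = deque(maxlen=window)  # True = correctly transmitted
        self.rates = deque(maxlen=window)     # recent link rate samples

    def record_tx(self, delivered):
        self.outcomes.append(bool(delivered))

    def record_rate(self, rate):
        self.rates.append(rate)

    def loss_probability(self):
        # One minus the ratio of correctly transmitted packets over
        # all transmitted packets in the window.
        if not self.outcomes:
            return None
        return 1 - sum(self.outcomes) / len(self.outcomes)

    def link_rate(self):
        # Average of the recent link rates in the window.
        if not self.rates:
            return None
        return sum(self.rates) / len(self.rates)


est = LinkEstimator(window=4)
for ok in (True, True, False, True):
    est.record_tx(ok)
p_first = est.loss_probability()
est.record_tx(False)              # the oldest sample falls out of the window
p_second = est.loss_probability()
est.record_rate(2.0)
est.record_rate(4.0)
```

A bounded `deque` discards the oldest sample automatically, so the estimates track recent link conditions without explicit pruning.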

The scheduling algorithm that we implemented mimics
Eq. (11). Each node $i$ knows the per-link queues, i.e., $V_l$, the
estimated loss probabilities, i.e., $\hat{p}_l$, and the link rates, i.e., $\hat{R}_l$,
for $l \in \mathcal{L}$, as well as all maximal independent sets, which
consist of links that do not interfere with each other. Let us assume that
there are $Q$ maximal independent sets. For the $q$th maximal
independent set such that $q = 1, \ldots, Q$, the policy vector is
$\pi_q = \{\pi_q^1, \ldots, \pi_q^l, \ldots, \pi_q^L\}$, where $\pi_q^l = 1$ if link $l$ is in the
$q$th maximal set, and $\pi_q^l = 0$ otherwise. Our scheduling
algorithm selects the $q^*$th maximal independent set such that
$q^* = \arg\max_{\forall q} \{\sum_{l \in \mathcal{L}} V_l (1 - \hat{p}_l) \hat{R}_l \pi_q^l\}$. Node $i$ recomputes $q^*$
whenever one of the parameters $V_l$, $\hat{p}_l$, $\hat{R}_l$ changes, $\forall l \in \mathcal{L}$. If, according
to $q^*$, node $i$ decides that it should activate one of its links, then
it reduces the contention window size of the 802.11 MAC so that
node $i$ can access the medium quickly and transmit a packet. If
node $i$ should not transmit, then the scheduling algorithm tells
the 802.11 MAC that there are no packets in the queues available
for transmission. Note that we update the 802.11 MAC protocol so
that we can implement the scheduling algorithm in Diff-Max.
The scheduling algorithm is summarized in Algorithm 2.

Note that the problem solved in Algorithm 2 is hard, because it reduces

4 Note that we do not use instantaneous channel states $C_l(t)$ in our
implementation, since it is not practical to get this information. Even if one
can estimate $C_l(t)$ using physical layer learning techniques, $C_l(t)$ should be
estimated $\forall l \in \mathcal{L}$, which is not practical in current wireless networks.



Algorithm 2 Diff-Max scheduling algorithm at node i.

1: if $V_l$, $\bar{p}_l$, or $\bar{R}_l$ is updated for some $l \in \mathcal{L}$ then
2:   Determine $q^*$ such that $q^* = \arg\max_{\forall q} \{\sum_{l \in \mathcal{L}} V_l (1 - \bar{p}_l) \bar{R}_l \pi_l^q\}$
3:   if $\exists (i, j)$ such that $\pi_{i,j}^{q^*} = 1$, $\forall j \in \mathcal{N}_i$ then
4:     Reduce the 802.11 MAC contention window size and access the medium
5:     Transmit a packet from $V_{i,j}$ according to the FIFO rule
6:   else
7:     Tell the 802.11 MAC that there are no packets in the queues available for transmission



to the maximum independent set problem [9], [13]. Furthermore,
it introduces a significant amount of overhead: each node needs
to know every other node's queue sizes and link loss rates. Due
to the hardness of the problem and the overhead, we implement
this algorithm only for small topologies in ns-2, for the purpose
of comparing its performance with the sub-optimal scheduling
algorithms that we describe next.
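
For small topologies, the selection step of the Diff-Max scheduler can be sketched by brute-force enumeration over the maximal independent sets, which are assumed to be given. The function names, the representation of a link as a `(source, destination)` tuple, and the dictionaries `V`, `p_hat`, and `R_hat` are hypothetical choices for this illustration only.

```python
def select_independent_set(independent_sets, V, p_hat, R_hat):
    """Return (q_star, weight) of the independent set with the largest
    sum of V_l * (1 - p_hat_l) * R_hat_l over its links."""
    best_q, best_weight = None, float("-inf")
    for q, links in enumerate(independent_sets):
        weight = sum(V[l] * (1.0 - p_hat[l]) * R_hat[l] for l in links)
        if weight > best_weight:
            best_q, best_weight = q, weight
    return best_q, best_weight


def node_should_transmit(node, independent_sets, q_star):
    """True if at least one outgoing link of `node` is in the selected set."""
    return any(src == node for (src, _dst) in independent_sets[q_star])


# Example on a diamond-like topology (hypothetical numbers).
sets = [
    [("A", "B"), ("C", "D")],
    [("A", "C"), ("B", "D")],
]
V = {("A", "B"): 10, ("A", "C"): 4, ("C", "D"): 2, ("B", "D"): 6}
p_hat = {("A", "B"): 0.5, ("A", "C"): 0.0, ("C", "D"): 0.0, ("B", "D"): 0.0}
R_hat = {link: 1.0 for link in V}
q_star, weight = select_independent_set(sets, V, p_hat, R_hat)
```

The loop is linear in the number of independent sets, but that number can grow exponentially with the number of links, which is why this approach is only practical for small topologies.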

B. Diff-subMax 

Diff-subMax is a low-complexity and low-overhead counterpart
of Diff-Max. The flow control and the routing parts
of Diff-subMax are exactly the same as in Diff-Max. The only
different part is the scheduling algorithm, which uses the 802.11
MAC protocol without any changes. When a transmission
opportunity arises according to the underlying 802.11 MAC at time t,
the scheduling algorithm of node i calculates weights for
all outgoing links to its neighbors. Let us consider link $i - j$ at
time t. The weight is $w_{i,j}(t) = V_{i,j}(t)(1 - \bar{p}_{i,j})\bar{R}_{i,j}$. Based on
the weights, the link is chosen as $l^* = \arg\max_{j \in \mathcal{N}_i} w_{i,j}(t)$.
This decision means that a packet from the link layer queue
$V_{l^*}$ is chosen according to the FIFO rule and transmitted. Note
that this scheduling algorithm only performs intra-scheduling,
i.e., it determines from which link layer queue packets should
be transmitted, but it does not determine which node should
transmit, which is handled by the 802.11 MAC.
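
The per-node link selection of Diff-subMax can be sketched as follows. The function name `pick_link` and the dictionary-based inputs, keyed by neighbor, are assumptions made for this illustration; returning `None` when every weight is zero is also a choice of the sketch.

```python
def pick_link(queues, p_hat, R_hat):
    """Pick the neighbor j whose link i-j has the largest weight
    V_ij * (1 - p_hat_ij) * R_hat_ij.

    queues, p_hat, R_hat: dicts keyed by neighbor id.
    Returns None if no link has a positive weight (nothing to send).
    """
    best_j, best_w = None, 0.0
    for j, backlog in queues.items():
        w = backlog * (1.0 - p_hat[j]) * R_hat[j]
        if w > best_w:
            best_j, best_w = j, w
    return best_j


# Node i with two neighbors: the lossy link to B loses to the clean link to C.
chosen = pick_link({"B": 8, "C": 5}, {"B": 0.5, "C": 0.0}, {"B": 1.0, "C": 1.0})
```

The function would be called each time the 802.11 MAC grants node i a transmission opportunity, and the head-of-line packet of the chosen queue would then be transmitted.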

Diff-subMax reduces the complexity of the algorithm and the
overhead significantly. In particular, each node i calculates and
compares the weights $w_{i,j}$ for each neighbor node j. Therefore,
the complexity is linear in the number of neighbor nodes.
The overhead is also significantly reduced: each node needs
to know the queue sizes of only its one-hop neighbors.

C. wDiff-subMax 

wDiff-subMax is an extension of Diff-subMax for
scenarios in which link layer operations and data exchange (between
the network and link layers) are not possible due to Wi-Fi
firmware or driver restrictions, or are not preferable.
Therefore, wDiff-subMax employs only routing and flow control,
without any scheduling mechanism. The flow control
algorithm is the same as in Diff-Max. The routing
algorithm, however, is updated as explained next.

Eq. (10) requires per-flow queues as well as per-link queues
for the routing decision. If per-link queues are not available
at the network layer, these parameters should be estimated.
wDiff-subMax, a window-based routing algorithm, implements
Eq. (10) by estimating per-link queue sizes. In particular, the
routing algorithm sends a window of packets and receives an
acknowledgement (ACK) for each transmitted packet. The
ACK mechanism has three functions: (i) it carries per-flow queue
size information, (ii) it provides reliability, i.e., packets which
are not ACKed are re-transmitted, and (iii) it is used to estimate
per-link queue sizes. The algorithm works as follows.

At time $t^k$, the window size for link $i - j$ is $W_{i,j}(t^k)$,
the average round trip time of the packets is $\overline{RTT}_{i,j}$, and the
average round trip time of the packets in the last window is
$RTT_{i,j}(t^k)$. If $U_i^s(t^k) - U_j^s(t^k) > 0$ and $RTT_{i,j}(t^k) < \overline{RTT}_{i,j}$,
then $W_{i,j}(t^k)$ is increased by 1. If $U_i^s(t^k) - U_j^s(t^k) > 0$
and $RTT_{i,j}(t^k) > \overline{RTT}_{i,j}$, then $W_{i,j}(t^k)$ is
decreased by 1. If none of the packets in the last window is
ACKed, $W_{i,j}(t^k)$ is halved. After $W_{i,j}(t^k)$ is determined,
$f_{i,j}(t^k)$ is set to $W_{i,j}(t^k)$, and $f_{i,j}(t^k)$ packets are passed
to the link layer. wDiff-subMax, similar to Diff-subMax,
significantly reduces computational complexity and overhead
as compared to Diff-Max.
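
The window update rule can be sketched as a single function. Several details are assumptions of this sketch rather than statements about our implementation: the threshold of zero on the queue difference, the lower bound `w_min` on the window, and the use of integer division when halving. The function and argument names are hypothetical.

```python
def update_window(W, queue_diff, rtt_last, rtt_avg, any_acked, w_min=1):
    """Return the new window size for link i-j.

    W          : current window size (packets)
    queue_diff : per-flow queue difference between node i and node j
    rtt_last   : average RTT of the packets in the last window
    rtt_avg    : long-term average RTT over the link
    any_acked  : False if no packet of the last window was ACKed
    w_min      : assumed lower bound so that the window never reaches zero
    """
    if not any_acked:
        # No ACK at all in the last window: halve the window.
        return max(w_min, W // 2)
    if queue_diff > 0 and rtt_last < rtt_avg:
        # Positive backlog difference and a short RTT: probe for more.
        return W + 1
    if queue_diff > 0 and rtt_last > rtt_avg:
        # RTT above average suggests the link layer queue is building up.
        return max(w_min, W - 1)
    return W
```

The returned value is the number of packets the routing algorithm would pass to the link layer for that link in the next round.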

V. Performance Evaluation 

A. Numerical Simulations 

We first simulate our scheme, Diff-Max, as well as classical
backpressure in an idealized time-slotted system using our
in-house simulator. The simulation results show that Diff-Max
performs as well as classical backpressure. Next, we
discuss the simulation setup and results in detail.

We consider the triangle and diamond topologies shown in
Figs. 4(a) and 4(b). In the triangle topology, there are two flows
between sources $S_1$, $S_2$ and receivers $R_1$, $R_2$, respectively.
$S_1$ originates at node A and ends at node B, and $S_2$
originates at node A and ends at node C. In the diamond
topology, there are two flows between sources $S_1$, $S_2$ and
receivers $R_1$, $R_2$, respectively. $S_1$ originates at node A
and ends at node B, and $S_2$ originates at node A and ends
at node D. In both topologies, all nodes are capable of relaying
packets to their neighbors. The simulation duration is 10000
slots, and each simulation is repeated for 10 seeds. Each slot is
in either the ON or the OFF state according to the loss probability,
which is i.i.d. over slots and uniformly distributed at each slot.
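
One simple way to generate such an ON/OFF channel in a slotted simulator is an independent Bernoulli draw per slot, sketched below. The function name and the use of Python's `random` module are assumptions of this illustration and do not describe our in-house simulator.

```python
import random


def simulate_on_off(loss_prob, num_slots, seed=0):
    """Return a list of booleans, one per slot: True means the link is ON.

    Each slot is OFF with probability loss_prob, independently of the others.
    """
    rng = random.Random(seed)  # a separate generator per seed for repeatability
    return [rng.random() >= loss_prob for _ in range(num_slots)]


# 10000 slots with a 30% loss probability on one link.
states = simulate_on_off(0.3, 10000, seed=0)
on_fraction = sum(states) / len(states)
```

Repeating the call with ten different values of `seed` would correspond to the ten repetitions of each simulation.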

Fig. 5 shows throughput vs. the loss probability for the
triangle topology. The loss is only over link A - C. Fig. 5(a)
shows the total throughput of the two flows, i.e., from $S_1$
to $R_1$ and from $S_2$ to $R_2$, while Fig. 5(b) and Fig. 5(c) present
the individual throughput of the flows from $S_1$ to $R_1$ and from $S_2$ to
$R_2$, respectively. As can be seen, both the total throughput and
the individual throughputs of Diff-Max are equal to those
of classical backpressure. This observation is confirmed
for different loss scenarios and for the diamond topology in
Figs. 6-8.

B. ns-2 Simulations 

In this section, we simulate our schemes, Diff-Max,
Diff-subMax, and wDiff-subMax, as well as classical backpressure in
the ns-2 simulator [10]. The simulation results show that Diff-Max,
Diff-subMax, and wDiff-subMax significantly improve
throughput as compared to the adaptive routing scheme Ad



Fig. 4. Topologies used in simulations. (a) Triangle topology. There are two flows between sources $S_1$, $S_2$ and receivers $R_1$, $R_2$, i.e., from node A to B
($S_1$ - $R_1$) and from node A to C ($S_2$ - $R_2$). (b) Diamond topology. There are two flows between sources $S_1$, $S_2$ and receivers $R_1$, $R_2$, i.e., from node
A to B ($S_1$ - $R_1$) and from node A to D ($S_2$ - $R_2$). (c) Grid topology. 12 nodes are randomly placed over a 4x3 grid. An example node distribution and
possible flows are illustrated in the figure.




Fig. 5. Triangle topology shown in Fig. 4(a). The loss is over link A - C. (a) Total throughput (sum of the throughput of flows from $S_1$ to $R_1$ and $S_2$ to
$R_2$) vs. loss probability. (b) Throughput of flow from $S_1$ to $R_1$ vs. loss probability. (c) Throughput of flow from $S_2$ to $R_2$ vs. loss probability.

Fig. 6. Triangle topology shown in Fig. 4(a). The loss is over all links. (a) Total throughput (sum of the throughput of flows from $S_1$ to $R_1$ and $S_2$ to $R_2$)
vs. loss probability. (b) Throughput of flow from $S_1$ to $R_1$ vs. loss probability. (c) Throughput of flow from $S_2$ to $R_2$ vs. loss probability.

Fig. 7. Diamond topology shown in Fig. 4(b). The loss is over link A - B. (a) Total throughput (sum of the throughput of flows from $S_1$ to $R_1$ and $S_2$ to
$R_2$) vs. loss probability. (b) Throughput of flow from $S_1$ to $R_1$ vs. loss probability. (c) Throughput of flow from $S_2$ to $R_2$ vs. loss probability.



hoc On-Demand Distance Vector (AODV) [11]. Next, we
present the simulator setup and results in detail.

1) Setup: We considered two topologies: the diamond topology
shown in Fig. 4(b) and a grid topology shown in Fig. 4(c). In




Fig. 8. Diamond topology shown in Fig. 4(b). The loss is over all links. (a) Total throughput (sum of the throughput of flows from $S_1$ to $R_1$ and $S_2$ to
$R_2$) vs. loss probability. (b) Throughput of flow from $S_1$ to $R_1$ vs. loss probability. (c) Throughput of flow from $S_2$ to $R_2$ vs. loss probability.



the diamond topology, the nodes are placed over a 500 m x 500 m
terrain. Two flows are transmitted from node A to nodes B
and D. In the grid topology, 4x3 cells are placed over an
800 m x 600 m terrain, and 12 nodes are randomly placed in the
cells. Each node can communicate with
other nodes in its cell or with the ones in neighboring cells.
Four flows are generated randomly.

We consider CBR traffic. CBR flows start at random times
within the first 5 sec and stay on until the end of the simulation,
which lasts 100 sec. The CBR flows generate packets with
inter-arrival times of 0.01 ms. IEEE 802.11b is used in the MAC
layer (with updates for the Diff-Max implementation as explained
in Section IV). In terms of the wireless channel, we simulated
a Rayleigh fading channel with average channel loss rates of
0, 20, 30, 40, and 50%.$^5$ We have repeated each 100 sec simulation
for 10 seeds.

The channel capacity is 1 Mbps, the buffer size at each
node is set to 1000 packets, and the packet size is set to 1000 B.
We compare our schemes, Diff-Max, Diff-subMax, and
wDiff-subMax, with AODV in terms of transport-level throughput.

The Diff-Max parameters are set as follows. For the flow
control algorithm: $T_i = 80$ ms, $R_i^{\max} = 20$ packets, $M = 200$.
For the routing algorithm: $T_i' = 10$ ms, $F_i^{\max} = 4$ packets.

2) Results: Fig. 9 presents the ns-2 simulation
results for the diamond and grid topologies at different loss rates.

Fig. 9(a) shows the results for the diamond topology. The
loss is over the link between nodes A and B. Diff-Max
performs better than the other schemes over the whole range of loss
rates. The reason is that Diff-Max activates links based on
per-link queue backlogs, loss rates, and link rates. On the other
hand, Diff-subMax, wDiff-subMax, and AODV use the classical
802.11 MAC, which provides fairness among the nodes competing
for the medium, which is not utility optimal. When
the loss rate over link A - B increases, the total throughput
of all the schemes decreases, as expected. As can be seen,
the throughput of our schemes (Diff-Max, Diff-subMax, and
wDiff-subMax) decreases linearly, while that of AODV drops quite sharply.
The reason is that when AODV experiences loss over a path,
it deletes the path and re-calculates new routes. Therefore,

5 We consider loss rates in the range up to 50%, because recent studies
of IEEE 802.11b based wireless mesh networks [17], [18] have reported
packet loss rates as high as 50%.



AODV does not transmit over lossy links for some time period 
and tries to find new routes, which reduces throughput. 

Fig. 9(b) elaborates on the above discussion. It shows
the throughput of the two flows, A to B and A to D, as well as
their total when the loss rate is 10% over link A - B. As
can be seen, the rate of flow A - B is very low in AODV as
compared to our schemes, because AODV considers link
A - B broken during some periods of the simulation, while
our schemes continue to transmit over this link.

Let us consider Fig. 9(a) again. Diff-subMax and
wDiff-subMax improve throughput significantly as compared to
AODV, thanks to exploring routes to improve utility (hence
throughput). The improvement of our schemes over AODV
is up to 22% in this topology. Also, Diff-subMax and
wDiff-subMax have similar throughput performance, which emphasizes
the benefit of the routing part and the effectiveness of the link layer queue
estimation mechanism of wDiff-subMax.

Fig. 9(a) also shows that when the loss rate is 50%, the
throughput of all schemes is similar, because
at a 50% loss rate link A - B becomes very inefficient, and
all of the schemes transmit packets mostly from flow A to D
over path A - C - D.

Fig. 9(c) shows the results for the grid topology. The
throughput of our schemes is higher than that of AODV
for all loss rates in the grid topology, and the improvement is larger
than in the diamond topology; e.g., the
improvement is up to 33% in the grid topology. The reason is that
AODV is designed to find the shortest paths, whereas our schemes
are able to explore interference-free paths even if they are not
the shortest, an advantage that is more pronounced in larger topologies.

C. Android Prototype 

We consider a scenario in which a group of smartphones
collaborate in the same geographical area. In our setting,
we use four Android 4.0 [12] based Galaxy Nexus phones
and configure them to operate in ad hoc mode over Wi-Fi.
We implement our wDiff-subMax scheme (flow control and
routing) as an extension of a UDP socket.

We first consider a scenario in which two phones (A and B)
are connected to each other. Phone A transmits a 4 MB audio
file to phone B. The transmission time for wDiff-subMax was
16 sec, which is comparable with its TCP counterpart, which




Fig. 9. Total throughput vs. average loss rate for different policies in two different topologies. (a) Total throughput vs. average loss rate in the diamond
topology. (b) Total and per-flow throughput for different policies when the average loss rate is set to 10% in the diamond topology. (c) Total throughput vs.
average loss rate in the grid topology.



was 14 sec. This example shows the efficiency of our algorithm
as an extension of UDP, which by itself causes packet losses or
excessively long transmission times.

In the second scenario, we placed the phones so as to
create a topology similar to the diamond topology
shown in Fig. 4(b). In this setup, phone A transmits a 4 MB
audio file to phone D using either phone B or phone C as a relay.
We first consider a TCP connection over the path A - B - D
and configure phone B so that it stops relaying packets after
10 sec of transmission. As expected, the TCP connection fails when
B stops relaying packets. On the other hand, wDiff-subMax
continues the transmission even after B stops, by relaying packets
through phone C, and completes the transmission in 40 sec.

VI. Related Work 

Backpressure and follow-up work. This paper builds on
backpressure, a routing and scheduling framework over
communication networks [1], [2], which has generated a lot of
interest in the research community [16], especially for wireless
ad hoc networks [19], [20], [21], [22], [23], [24]. Also, it
has been shown that backpressure can be combined with flow
control to provide utility-optimal operation guarantees [3], [23].
This paper follows the main idea of the backpressure framework
and revisits it considering the practical challenges that are
imposed by current networks.

Backpressure implementation. The strengths of the
backpressure framework have recently increased the interest in
practical implementations of backpressure over wireless
networks. A multi-path TCP scheme is implemented over wireless
mesh networks in [6], where TCP flows are transmitted over
multiple pre-determined paths and packets are scheduled
according to a backpressure scheduling algorithm. At the link layer,
[4], [5], [25], [26] propose, analyze, and evaluate link layer
backpressure-based implementations with queue prioritization
and congestion window size adjustment. The backpressure
framework is implemented over sensor networks [7] and
wireless multi-hop networks [8], which are also the closest
implementations to ours. Our main differences are that (i)
we consider the separation of routing and scheduling to make
practical implementation easier, (ii) we design and analyze a
new scheme, Diff-Max, and (iii) we simulate and implement
Diff-Max in ns-2 and on Android phones.



Backpressure and Queues. According to the backpressure
framework, each node constructs per-flow queues. There is
some work in the literature that relaxes this requirement. For
example, [27], [28] propose using real per-link and virtual
per-flow queues. Such a method reduces the number of queues
required at each node and reduces the delay. Although this
approach decouples routing and scheduling, in that routing
decisions use the virtual queues and scheduling decisions use
the real per-link queues,
it does not separate routing from scheduling. Therefore, this
approach requires strong synchronization between the network
and link layers, which is difficult to implement in practice, as
explained in Section I.

VII. Conclusion 

In this paper, we proposed Diff-Max, a framework that
separates routing and scheduling in backpressure-based wireless
networks. Diff-Max improves throughput significantly. Also,
the separation of routing and scheduling makes practical
implementation easier by minimizing cross-layer operations, and
it leads to modularity. Our design is grounded in a network
utility maximization (NUM) formulation of the problem and
its solution. Simulations in ns-2 demonstrate the performance
of Diff-Max as compared to adaptive routing schemes, such as
AODV. The evaluations on an Android testbed confirm the
efficiency and practicality of our approach.